Skip to content

Automated Test: embed-url-handling-post #327

Closed

Conversation

admin-coderabbit
Copy link
Owner

@admin-coderabbit admin-coderabbit commented Feb 4, 2026

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

  • New Features
    • Added support for embedding forum discussions on external websites with responsive iframe integration
    • Added automated RSS/Atom feed polling to import articles as forum topics
    • Introduced new site settings to configure embedding hosts, feed URLs, and import users

@coderabbit-eval
Copy link

coderabbit-eval bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a discussion embedding feature for Discourse. It adds gems for feed handling, creates a new EmbedController with authentication, defines TopicEmbed model for tracking embedded topics, implements background jobs for RSS polling and topic retrieval, adds client-side embedding JavaScript, and introduces configuration settings for embedding hosts and feed sources.

Changes

Cohort / File(s) Summary
Dependencies
Gemfile
Added ruby-readability and simple-rss gems with require: false for feed importing and readability extraction.
Configuration & Routing
config/routes.rb, config/site_settings.yml, config/locales/server.en.yml, config/locales/client.en.yml
Added embed route, site settings for embeddable_host/feed_polling_enabled/feed_polling_url/embed_by_username, and localization keys for embed UI strings and admin category label.
Frontend Assets
app/assets/javascripts/embed.js, app/assets/stylesheets/embed.css.scss, app/views/layouts/embed.html.erb, app/views/embed/best.html.erb, app/views/embed/loading.html.erb
Created embed layout with cross-origin messaging for iframe height adjustment, embed view for displaying best topic posts with author/content styling, and loading view with auto-refresh.
Core Models
app/models/post.rb, app/models/topic_embed.rb
Added cook_methods enum to Post; introduced TopicEmbed model with import/import_remote/absolutize_urls methods for creating topics from external URLs and RSS feeds.
Controllers
app/controllers/embed_controller.rb
New EmbedController with best action that validates embeddable_host, checks referer origin, retrieves topic view, and queues background retrieval if needed.
Background Jobs
app/jobs/regular/retrieve_topic.rb, app/jobs/scheduled/poll_feed.rb
Added RetrieveTopic job to fetch topics via TopicRetriever; added PollFeed scheduled job to parse RSS/Atom feeds and import topics via TopicEmbed.
Libraries & Utilities
lib/topic_retriever.rb, lib/post_creator.rb, lib/post_revisor.rb, lib/tasks/disqus.thor
Added TopicRetriever class with host validation and throttling; updated PostCreator to propagate cook_method; updated PostRevisor to support skip_validations; refactored Disqus import to use TopicEmbed.import_remote.
Database Migrations
db/migrate/20131210181901_migrate_word_counts.rb, db/migrate/20131217174004_create_topic_embeds.rb, db/migrate/20131219203905_add_cook_method_to_posts.rb, db/migrate/20131223171005_create_top_topics.rb
Created topic_embeds table with unique embed_url index; added cook_method integer column to posts; whitespace fix in word_counts migration; added force: true to top_topics creation.
Tests
spec/controllers/embed_controller_spec.rb, spec/jobs/poll_feed_spec.rb, spec/models/topic_embed_spec.rb, spec/components/topic_retriever_spec.rb
Added test coverage for EmbedController best action validation, PollFeed job gating, TopicEmbed import/validation/URL handling, and TopicRetriever host/throttle logic.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant EmbedController
    participant TopicRetriever
    participant TopicEmbed
    participant Database
    
    Client->>EmbedController: GET /embed/best?embed_url=...
    EmbedController->>EmbedController: Validate embeddable_host & referer
    EmbedController->>TopicEmbed: topic_id_for_embed(embed_url)
    TopicEmbed->>Database: Query topic_embeds
    Database-->>TopicEmbed: Return topic_id or nil
    TopicEmbed-->>EmbedController: Return topic_id
    
    alt Topic exists
        EmbedController->>EmbedController: Create TopicView
        EmbedController-->>Client: Render best.html.erb
    else Topic not found
        EmbedController->>TopicRetriever: Queue retrieve job
        EmbedController-->>Client: Render loading.html.erb
        
        TopicRetriever->>TopicRetriever: Validate host & throttle
        TopicRetriever->>TopicEmbed: import_remote(user, url)
        TopicEmbed->>TopicEmbed: Fetch via Readability
        TopicEmbed->>TopicEmbed: Absolutize URLs
        TopicEmbed->>Database: Create Topic & Post
        Database-->>TopicEmbed: Return created post
    end
Loading
sequenceDiagram
    participant Scheduler
    participant PollFeed
    participant SimpleRSS
    participant TopicEmbed
    participant Database
    
    Scheduler->>PollFeed: execute()
    PollFeed->>PollFeed: Check feed_polling_enabled, feed_polling_url, embed_by_username
    
    alt Settings valid
        PollFeed->>PollFeed: Resolve embedding user by username
        PollFeed->>SimpleRSS: Parse feed from feed_polling_url
        SimpleRSS-->>PollFeed: Return feed items
        
        loop For each feed item
            PollFeed->>PollFeed: Extract URL, title, content
            PollFeed->>PollFeed: Unescape HTML entities
            PollFeed->>TopicEmbed: import(user, url, title, content)
            TopicEmbed->>Database: Create/Update topic & post
            Database-->>TopicEmbed: Confirm
        end
    else Settings missing
        PollFeed->>PollFeed: Return early
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Poem

🐰 A feature hops in with feed and embed,
Cross-origin messages dance overhead,
Topics now travel to distant lands,
RSS feeds flow through caring hands,
Discourse takes flight on the web so grand! 🚀

🚥 Pre-merge checks | ✅ 1 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Automated Test: embed-url-handling-post' is vague and generic, using non-descriptive terms that don't convey meaningful information about the actual changeset. Replace the generic title with a specific, descriptive summary of the main change, such as 'Add embed functionality with RSS feed polling and topic retrieval' or 'Implement discussion embed feature with feed integration'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch embed-url-handling-post

Comment @coderabbit eval help to get the list of available commands and usage tips.

@coderabbit-eval coderabbit-eval bot closed this Feb 4, 2026
Copy link

@coderabbit-eval coderabbit-eval bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 15

🤖 Fix all issues with AI agents
In `@app/assets/javascripts/embed.js`:
- Around line 5-12: Guard against a missing container and ensure the discourse
URL has a trailing slash: check the result of
document.getElementById('discourse-comments') (variable comments) is non-null
before calling comments.appendChild(iframe) and bail or log if missing; also
normalize discourseUrl when building iframe.src (the variable used with
discourseEmbedUrl) so it always ends with '/' (or insert one when concatenating)
to avoid malformed URLs. Ensure references: comments, iframe, discourseUrl,
discourseEmbedUrl, document.getElementById, and appendChild.

In `@app/controllers/embed_controller.rb`:
- Around line 24-31: The ensure_embeddable method currently sets a non-standard
X-Frame-Options value and enforces a brittle Referer host check; remove the
response.headers['X-Frame-Options'] = "ALLOWALL" line (do not emit an invalid
X-Frame-Options value) and instead, if you need explicit framing rules, set a
proper Content-Security-Policy frame-ancestors header or omit frame headers
entirely; also make the referer validation (the URI(request.referer || '').host
comparison against SiteSetting.embeddable_host in ensure_embeddable) a soft
check—treat missing/invalid Referer as allowed (or log and continue) rather than
raising Discourse::InvalidAccess, since Referer is spoofable/omitted and should
not be a hard security boundary.
- Around line 8-20: The cache header discourse_expires_in 1.minute is applied
unconditionally in the best action, causing the loading view to be cached; move
the discourse_expires_in call into the branch where
TopicEmbed.topic_id_for_embed(embed_url) returns a topic_id (i.e., inside the if
topic_id block, after assigning `@topic_view`) so only the real topic response is
cached, or alternatively set a much shorter expiry when rendering 'loading' in
the else branch (where Jobs.enqueue(:retrieve_topic, ...) and render 'loading'
occur) to avoid caching the loading page for a full minute.

In `@app/jobs/scheduled/poll_feed.rb`:
- Around line 35-36: i.content may be nil causing .scrub to raise; update the
extraction before calling CGI.unescapeHTML and TopicEmbed.import to guard
against nil by selecting the first available body (prefer i.content, fallback to
i.description, then i.summary), call .to_s or ensure non-nil before .scrub, and
then pass the sanitized content variable into TopicEmbed.import(user, url,
i.title, content). Ensure the symbol references i.content, i.description,
i.summary, .scrub, CGI.unescapeHTML, and TopicEmbed.import are used so the
change is applied at the correct spot.
- Line 29: Replace the unsafe Kernel#open call in the SimpleRSS.parse call with
an explicit URI.open/OpenURI.open_uri call to avoid command injection; update
the line using SimpleRSS.parse open(SiteSetting.feed_polling_url) to use
URI.open(SiteSetting.feed_polling_url) (or OpenURI.open_uri) and add require
'open-uri' at the top of poll_feed.rb if not already present so the intent is
explicit and Brakeman warnings are resolved.

In `@app/models/topic_embed.rb`:
- Line 48: The code uses open(url) to fetch external content which allows
command execution and SSRF; in app/models/topic_embed.rb replace Kernel#open
with URI.open and add URL validation before fetching: parse the url, ensure an
allowed scheme (http or https), reject or whitelist hosts/IPs (prevent private
IPs/metadata addresses), and only then call URI.open to read the body for
Readability::Document.new (reference: Readability::Document and the local
variable url). Ensure exceptions from invalid URLs or blocked destinations are
handled and logged.
- Line 13: Replace the in-place mutation and raw HTML interpolation: stop using
the destructive operator on contents (change the `contents << ...` usage in
topic_embed.rb to build a new string instead, e.g. assign back with `contents =
contents + ...` or `contents += ...`) and HTML-escape the `url` before embedding
it into the anchor tag (use ERB::Util.html_escape or CGI.escapeHTML on `url` and
insert that escaped value into the generated string). Ensure the final string
you assign is the safe HTML you intend to store/return.

In `@app/views/embed/best.html.erb`:
- Line 6: The ERB template has an invalid Ruby post-condition clause "<%- end if
%>" which causes a SyntaxError; replace that token with a standard ERB block
terminator "<% end %>" to correctly close the preceding if block (i.e., change
"<%- end if %>" -> "<% end %>" in the template).
- Around line 15-18: The img tag currently interpolates
post.user.small_avatar_url directly into src which can break HTML if the URL
contains single quotes or malicious characters; replace the raw tag with Rails'
safe helper by using image_tag(post.user.small_avatar_url, alt:
post.user.username) or at minimum wrap the value with ERB escaping
(h(post.user.small_avatar_url)) so the src attribute is properly escaped; update
the view around the author block (where post.user and small_avatar_url are used)
to use image_tag or h(...) and include an alt attribute for accessibility.
- Line 19: The embed view renders unsanitized HTML because the raw_html cook
method (Post.rb:133) returns raw content and TopicEmbed.import (called from
poll_feed.rb:36) consumes RSS content without sanitization; fix by ensuring
external HTML is sanitized via Discourse's Sanitizer before it reaches
TopicEmbed.import or by removing/replacing the raw_html cook method so the
normal sanitization pipeline is used. Locate the raw_html method in Post.rb and
either delete/disable it or change callers (e.g., in poll_feed.rb and
TopicEmbed.import) to call Sanitizer.sanitize (or the appropriate
Discourse::Sanitization API) on the decoded RSS content before passing to
TopicEmbed.import/TopicEmbed.import_remote, and confirm post.cooked is produced
by the standard, sanitized cook pipeline rather than returned raw.

In `@app/views/embed/loading.html.erb`:
- Around line 6-11: The page currently does an unbounded reload via the IIFE
that calls setTimeout -> document.location.reload(); modify this logic to
implement a bounded retry counter (e.g., stored in sessionStorage under a key
like "embedReloadAttempts" or encoded as a query parameter) so the IIFE reads
the counter, increments it on each scheduled reload, and only calls
document.location.reload() while attempts < MAX_ATTEMPTS (then stop or show an
error). Ensure the same unique symbols are updated: the self-invoking function,
the setTimeout invocation, and the document.location.reload() call so the reload
only happens up to the configured limit and the counter is persisted across
reloads.

In `@app/views/layouts/embed.html.erb`:
- Around line 6-15: The template is interpolating request.referer directly into
the JS postMessage call which risks XSS and breaks when nil; update the
embed.html.erb code that calls parent.postMessage to JS-escape the referer using
Rails' escape_javascript (or the j helper) when building the target origin and
provide a sensible fallback when request.referer is nil (for example use "*" or
derive origin from the embeddable_host site setting) so parent.postMessage
receives a valid, safely-escaped origin string.

In `@db/migrate/20131217174004_create_topic_embeds.rb`:
- Line 3: The migration currently uses create_table :topic_embeds, force: true
which will drop an existing table silently; remove the force: true option from
the create_table call (so that create_table :topic_embeds do |t| ...) to prevent
destructive automatic drops and allow the migration to fail if the table already
exists; if you want safe idempotence instead, replace the force option by
guarding with a conditional check like table_exists?(:topic_embeds) or wrap the
creation in a conditional, updating the migration method that contains
create_table :topic_embeds to ensure it no longer uses force: true.

In `@db/migrate/20131223171005_create_top_topics.rb`:
- Line 3: The migration currently passes force: true to create_table :top_topics
which will silently drop and recreate the table (risking data loss); remove
force: true and instead guard creation (e.g., check table_exists?(:top_topics)
and only call create_table when it does not exist, or use create_table with
if_not_exists: true if your Rails version supports it), keeping the table
name/top_topics and the existing create_table block unchanged otherwise.

In `@lib/topic_retriever.rb`:
- Around line 27-30: Replace the non-atomic two-step lock (calling
$redis.setnx(retrieved_key, "1") then $redis.expire(retrieved_key, 60)) with a
single atomic Redis SET using the EX and NX options so the TTL is applied
atomically; locate the block around the retrieved_key check in
lib/topic_retriever.rb and change the logic to use $redis.set with EX:60 and NX
semantics (treat success the same as setnx returning true) so the key cannot be
left without a TTL if the process crashes.
🧹 Nitpick comments (19)
app/assets/stylesheets/embed.css.scss (1)

14-36: Extract the shared author-column width into an SCSS variable.

The 92px width appears in both .author (line 16) and .cooked margin-left (line 31). If one changes without the other, the layout breaks. A shared variable eliminates this risk.

♻️ Suggested refactor
+$author-column-width: 92px;
+
 article.post {
   border-bottom: 1px solid `#ddd`;

   .post-date {
     float: right;
     color: `#aaa`;
     font-size: 12px;
     margin: 4px 4px 0 0;
   }

   .author {
     padding: 20px 0;
-    width: 92px;
+    width: $author-column-width;
     float: left;

     text-align: center;

     h3 {
       text-align: center;
       color: `#4a6b82`;
       font-size: 13px;
       margin: 0;
     }
   }

   .cooked {
     padding: 20px 0;
-    margin-left: 92px;
+    margin-left: $author-column-width;

     p {
       margin: 0 0 1em 0;
     }
   }
 }
Gemfile (1)

209-211: Version pinning for new gems is optional and consistent with this Gemfile's pattern.

ruby-readability (0.7.3) and simple-rss (2.1.0) are both stable, well-maintained gems with millions of downloads—not abandonment-ware. While pinning versions is a best practice, most gems in this Gemfile don't have version constraints, so leaving these unpinned is consistent with the project's existing approach.

app/assets/javascripts/embed.js (1)

15-24: Origin check is loose — consider a stricter comparison.

discourseUrl.indexOf(e.origin) accepts any origin that is a substring of discourseUrl. While browser-set e.origin limits exploitation, a strict comparison against the parsed origin of discourseUrl would be more robust.

♻️ Suggested stricter origin check
  function postMessageReceived(e) {
    if (!e) { return; }
-   if (discourseUrl.indexOf(e.origin) === -1) { return; }
+   var a = document.createElement('a');
+   a.href = discourseUrl;
+   if (a.origin !== e.origin) { return; }

    if (e.data) {
db/migrate/20131217174004_create_topic_embeds.rb (1)

4-5: Consider adding an index on topic_id for efficient lookups.

Queries joining or filtering by topic_id (e.g., finding the embed record for a given topic) will benefit from an index. post_id may also warrant one depending on query patterns.

♻️ Add index
    add_index :topic_embeds, :embed_url, unique: true
+   add_index :topic_embeds, :topic_id
config/routes.rb (1)

245-246: Route addition looks correct.

The route maps cleanly to embed#best. Note: Brakeman flags that Rails 3.2.12 with globbing routes is vulnerable to directory traversal and RCE (CVE-2014-0130). While this specific route doesn't use globbing, upgrading Rails to at least 3.2.18 is strongly recommended to mitigate this framework-level vulnerability.

spec/jobs/poll_feed_spec.rb (2)

12-28: Negative tests may not isolate each condition independently.

Each test stubs only one setting but leaves the others unstubbed (relying on their defaults). If execute short-circuits on the first falsy condition, then e.g. the feed_polling_url test (line 18–22) might never actually check feed_polling_url because feed_polling_enabled? already defaults to false. Consider stubbing the other two settings to their "happy path" values in each negative test so that only the condition under test causes the early return.

Example for the `feed_polling_url` test
     it "requires feed_polling_url" do
+      SiteSetting.stubs(:feed_polling_enabled?).returns(true)
       SiteSetting.stubs(:feed_polling_url).returns(nil)
+      SiteSetting.stubs(:embed_by_username).returns(embed_by_username)
       poller.expects(:poll_feed).never
       poller.execute({})
     end

Apply the same pattern to the other two negative tests.


5-6: Extra empty lines flagged by RuboCop.

Static analysis flagged extra empty lines at block body beginning (line 5) and end (line 39). Remove them to satisfy Layout/EmptyLinesAroundBlockBody.

Proposed fix
 describe Jobs::PollFeed do
-
   let(:poller) { Jobs::PollFeed.new }
     end
   end
-
 end

Also applies to: 39-40

lib/post_creator.rb (1)

212-216: cook_method could be consolidated into the pass-through attributes list.

Line 216 duplicates the exact pattern used in the loop on lines 212–214. You could simply add :cook_method to the array to reduce repetition.

Proposed change
-    [:post_type, :no_bump, :cooking_options, :image_sizes, :acting_user, :invalidate_oneboxes].each do |a|
+    [:post_type, :no_bump, :cooking_options, :image_sizes, :acting_user, :invalidate_oneboxes, :cook_method].each do |a|
       post.send("#{a}=", `@opts`[a]) if `@opts`[a].present?
     end
-
-    post.cook_method = `@opts`[:cook_method] if `@opts`[:cook_method].present?
spec/components/topic_retriever_spec.rb (2)

21-25: Invalid-URL test doesn't stub embeddable_host — may pass for the wrong reason.

If TopicRetriever#retrieve checks embeddable_host before parsing the URL, this test passes because the host check fails (nil), not because the URL is invalid. Stub embeddable_host to a value so the URL-validation path is actually exercised.

Proposed fix
   it "does not call perform_retrieve when the embed url is not a url" do
+    SiteSetting.expects(:embeddable_host).returns("eviltrout.com")
     r = TopicRetriever.new("not a url")
     r.expects(:perform_retrieve).never
     r.retrieve
   end

5-6: Extra empty lines flagged by RuboCop.

Remove the extra blank lines at block body beginning (line 5) and end (lines 43, 45) per Layout/EmptyLinesAroundBlockBody.

Also applies to: 43-46

spec/controllers/embed_controller_spec.rb (2)

35-38: Using after for assertions is an anti-pattern.

Assertions in after blocks run even when the test body raises, producing confusing double-failures. Move these checks into each it block (or use a shared after only for teardown).

Proposed refactor — inline assertions
-      after do
-        response.should be_success
-        response.headers['X-Frame-Options'].should == "ALLOWALL"
-      end
-
       it "tells the topic retriever to work when no previous embed is found" do
         TopicEmbed.expects(:topic_id_for_embed).returns(nil)
         retriever = mock
         TopicRetriever.expects(:new).returns(retriever)
         retriever.expects(:retrieve)
         get :best, embed_url: embed_url
+        response.should be_success
+        response.headers['X-Frame-Options'].should == "ALLOWALL"
       end

       it "creates a topic view when a topic_id is found" do
         TopicEmbed.expects(:topic_id_for_embed).returns(123)
         TopicView.expects(:new).with(123, nil, {best: 5})
         get :best, embed_url: embed_url
+        response.should be_success
+        response.headers['X-Frame-Options'].should == "ALLOWALL"
       end

4-5: Extra empty lines flagged by RuboCop.

Lines 4, 30, 54, and 57 have extra blank lines per Layout/EmptyLinesAroundBlockBody. Remove them to satisfy the convention.

Also applies to: 54-58

spec/models/topic_embed_spec.rb (3)

34-34: start_with? assertion gives opaque failure messages.

If this assertion fails, RSpec will only report "expected true, got false" without showing the actual cooked value. Use should start_with(...) or should include(...) instead so failures display the actual vs. expected content.

Proposed fix
-        post.cooked.start_with?("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">").should be_true
+        post.cooked.should start_with("hello world new post <a href=\"http://eviltrout.com/hello\">hello</a> <img src=\"http://eviltrout.com/images/wat.jpg\">")

3-48: Missing test coverage for import_remote, topic_id_for_embed, and absolutize_urls.

The AI summary and other specs (embed_controller_spec, topic_retriever_spec) reference TopicEmbed.topic_id_for_embed and TopicEmbed.import_remote, but neither has a dedicated test here. absolutize_urls is only tested indirectly. Consider adding direct unit tests for these class methods, especially topic_id_for_embed which is used in the controller.

Would you like me to generate test stubs for these methods or open an issue to track this?


4-5: Extra empty lines flagged by RuboCop.

Multiple Layout/EmptyLinesAroundBlockBody violations at lines 4, 10, 12, 43, 45, and 47. Remove the extra blank lines.

Also applies to: 10-12, 43-48

app/jobs/regular/retrieve_topic.rb (1)

1-1: Unused require_dependency.

email/sender is never referenced in this file. This appears to be a copy-paste artifact.

Proposed fix
-require_dependency 'email/sender'
 require_dependency 'topic_retriever'
app/jobs/scheduled/poll_feed.rb (1)

20-22: feed_key is defined but never called — dead code.

This method computes a cache key but is not referenced anywhere in this file or, per the summary, elsewhere. If it's intended for future use (e.g., ETag/Last-Modified caching), consider removing it until then.

lib/topic_retriever.rb (1)

40-43: Synchronously executing the entire PollFeed job is expensive and fragile.

Jobs::PollFeed.new.execute({}) runs the full RSS poll inline—fetching and importing all feed items—just to check if one specific embed_url happens to be in the feed. Any error during the poll (network timeout, malformed feed) will propagate up and abort the single-topic retrieval. Consider either enqueuing PollFeed asynchronously or extracting a targeted lookup for the specific URL.

app/models/topic_embed.rb (1)

1-1: Use require instead of require_dependency for gems.

require_dependency is a Rails autoloading mechanism for application code. Gems like Nokogiri should be loaded with plain require.

Proposed fix
-require_dependency 'nokogiri'
+require 'nokogiri'

Comment on lines +5 to +12
var comments = document.getElementById('discourse-comments'),
iframe = document.createElement('iframe');
iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
iframe.id = 'discourse-embed-frame';
iframe.width = "100%";
iframe.frameBorder = "0";
iframe.scrolling = "no";
comments.appendChild(iframe);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential NPE if #discourse-comments element is missing.

If document.getElementById('discourse-comments') returns null, the appendChild call on Line 12 will throw a TypeError. Consider adding a guard.

Also, Line 7: if discourseUrl doesn't end with a trailing /, the constructed URL will be malformed (e.g., "https://example.comembed/best?...").

🐛 Proposed fix
  var comments = document.getElementById('discourse-comments'),
      iframe = document.createElement('iframe');
- iframe.src = discourseUrl + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
+ iframe.src = discourseUrl + (discourseUrl[discourseUrl.length-1] === '/' ? '' : '/') + "embed/best?embed_url=" + encodeURIComponent(discourseEmbedUrl);
  iframe.id = 'discourse-embed-frame';
  iframe.width = "100%";
  iframe.frameBorder = "0";
  iframe.scrolling = "no";
- comments.appendChild(iframe);
+ if (comments) { comments.appendChild(iframe); }
🤖 Prompt for AI Agents
In `@app/assets/javascripts/embed.js` around lines 5 - 12, Guard against a missing
container and ensure the discourse URL has a trailing slash: check the result of
document.getElementById('discourse-comments') (variable comments) is non-null
before calling comments.appendChild(iframe) and bail or log if missing; also
normalize discourseUrl when building iframe.src (the variable used with
discourseEmbedUrl) so it always ends with '/' (or insert one when concatenating)
to avoid malformed URLs. Ensure references: comments, iframe, discourseUrl,
discourseEmbedUrl, document.getElementById, and appendChild.

Comment on lines +8 to +20
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)

if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end

discourse_expires_in 1.minute
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

best action: cache header is set even on the loading render path — verify this is intentional.

discourse_expires_in 1.minute on Line 19 runs unconditionally, including when the topic hasn't been fetched yet and the loading view is rendered. This means the "loading" response could be cached for up to 1 minute, delaying the user from seeing the actual content once the background job completes. Consider moving the cache directive inside the if topic_id branch, or using a much shorter expiry for the loading path.

Proposed fix
   def best
     embed_url = params.require(:embed_url)
     topic_id = TopicEmbed.topic_id_for_embed(embed_url)
 
     if topic_id
       `@topic_view` = TopicView.new(topic_id, current_user, {best: 5})
+      discourse_expires_in 1.minute
     else
       Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
       render 'loading'
     end
-
-    discourse_expires_in 1.minute
   end
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
@topic_view = TopicView.new(topic_id, current_user, {best: 5})
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end
discourse_expires_in 1.minute
end
def best
embed_url = params.require(:embed_url)
topic_id = TopicEmbed.topic_id_for_embed(embed_url)
if topic_id
`@topic_view` = TopicView.new(topic_id, current_user, {best: 5})
discourse_expires_in 1.minute
else
Jobs.enqueue(:retrieve_topic, user_id: current_user.try(:id), embed_url: embed_url)
render 'loading'
end
end
🤖 Prompt for AI Agents
In `@app/controllers/embed_controller.rb` around lines 8 - 20, The cache header
discourse_expires_in 1.minute is applied unconditionally in the best action,
causing the loading view to be cached; move the discourse_expires_in call into
the branch where TopicEmbed.topic_id_for_embed(embed_url) returns a topic_id
(i.e., inside the if topic_id block, after assigning `@topic_view`) so only the
real topic response is cached, or alternatively set a much shorter expiry when
rendering 'loading' in the else branch (where Jobs.enqueue(:retrieve_topic, ...)
and render 'loading' occur) to avoid caching the loading page for a full minute.

Comment on lines +24 to +31
def ensure_embeddable
raise Discourse::InvalidAccess.new('embeddable host not set') if SiteSetting.embeddable_host.blank?
raise Discourse::InvalidAccess.new('invalid referer host') if URI(request.referer || '').host != SiteSetting.embeddable_host

response.headers['X-Frame-Options'] = "ALLOWALL"
rescue URI::InvalidURIError
raise Discourse::InvalidAccess.new('invalid referer host')
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

X-Frame-Options: ALLOWALL is not a standard directive; referer check is easily bypassed.

Two observations on ensure_embeddable:

  1. Line 28: ALLOWALL is not a valid X-Frame-Options value (only DENY, SAMEORIGIN, and ALLOW-FROM are defined in RFC 7034). Browsers will ignore the unrecognized value, which effectively allows framing—so the intent works, but only by accident. Consider omitting the header entirely or using Content-Security-Policy: frame-ancestors * for explicit semantics.

  2. Line 26: The Referer-based host check is trivially spoofable and will also reject legitimate requests from browsers/proxies that strip the Referer header (privacy settings, HTTPS→HTTP transitions). This is acceptable as a soft hotlink guard for public content, but should not be relied upon as a security boundary.

🤖 Prompt for AI Agents
In `@app/controllers/embed_controller.rb` around lines 24 - 31, The
ensure_embeddable method currently sets a non-standard X-Frame-Options value and
enforces a brittle Referer host check; remove the
response.headers['X-Frame-Options'] = "ALLOWALL" line (do not emit an invalid
X-Frame-Options value) and instead, if you need explicit framing rules, set a
proper Content-Security-Policy frame-ancestors header or omit frame headers
entirely; also make the referer validation (the URI(request.referer || '').host
comparison against SiteSetting.embeddable_host in ensure_embeddable) a soft
check—treat missing/invalid Referer as allowed (or log and continue) rather than
raising Discourse::InvalidAccess, since Referer is spoofable/omitted and should
not be a hard security boundary.

return if user.blank?

require 'simple-rss'
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

open() on admin-supplied URL — use URI.open to avoid Kernel#open command injection.

Brakeman's flag is valid here. In Ruby versions before 3.0, Kernel#open interprets strings starting with | as shell commands. While SiteSetting.feed_polling_url is admin-controlled, using URI.open (or OpenURI.open_uri) makes the intent explicit and eliminates the risk entirely.

Proposed fix
-      rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
+      rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
rss = SimpleRSS.parse open(SiteSetting.feed_polling_url)
rss = SimpleRSS.parse URI.open(SiteSetting.feed_polling_url)
🧰 Tools
🪛 Brakeman (8.0.1)

[medium] 29-29: Model attribute used in file name
Type: File Access
Confidence: Medium
More info: https://brakemanscanner.org/docs/warning_types/file_access/

(File Access)

🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` at line 29, Replace the unsafe Kernel#open
call in the SimpleRSS.parse call with an explicit URI.open/OpenURI.open_uri call
to avoid command injection; update the line using SimpleRSS.parse
open(SiteSetting.feed_polling_url) to use URI.open(SiteSetting.feed_polling_url)
(or OpenURI.open_uri) and add require 'open-uri' at the top of poll_feed.rb if
not already present so the intent is explicit and Brakeman warnings are
resolved.

Comment on lines +35 to +36
content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

i.content may be nil for feed items without a <content> element, causing NoMethodError on .scrub.

Not all RSS/Atom items have a content field (some only have description or summary). When i.content is nil, calling .scrub on it will raise NoMethodError. Add a nil guard or fall back to i.description / i.summary.

Proposed fix
-        content = CGI.unescapeHTML(i.content.scrub)
+        raw_content = i.content || i.description || i.summary || ''
+        content = CGI.unescapeHTML(raw_content.scrub)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
content = CGI.unescapeHTML(i.content.scrub)
TopicEmbed.import(user, url, i.title, content)
raw_content = i.content || i.description || i.summary || ''
content = CGI.unescapeHTML(raw_content.scrub)
TopicEmbed.import(user, url, i.title, content)
🤖 Prompt for AI Agents
In `@app/jobs/scheduled/poll_feed.rb` around lines 35 - 36, i.content may be nil
causing .scrub to raise; update the extraction before calling CGI.unescapeHTML
and TopicEmbed.import to guard against nil by selecting the first available body
(prefer i.content, fallback to i.description, then i.summary), call .to_s or
ensure non-nil before .scrub, and then pass the sanitized content variable into
TopicEmbed.import(user, url, i.title, content). Ensure the symbol references
i.content, i.description, i.summary, .scrub, CGI.unescapeHTML, and
TopicEmbed.import are used so the change is applied at the correct spot.

Comment on lines +6 to +11
<script>
(function() {
setTimeout(function() {
document.location.reload();
}, 30000);
})();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unbounded auto-reload loop — no retry limit.

If the embedded content never becomes available, this page will reload every 30 seconds indefinitely, wasting bandwidth and server resources. Consider adding a retry counter (e.g., via a query parameter or sessionStorage) and stopping after a reasonable number of attempts.

💡 Sketch of a bounded retry
  <script>
    (function() {
+     var maxRetries = 10;
+     var retryCount = parseInt(sessionStorage.getItem('embed_retry') || '0', 10);
+     if (retryCount >= maxRetries) {
+       sessionStorage.removeItem('embed_retry');
+       return;
+     }
+     sessionStorage.setItem('embed_retry', retryCount + 1);
      setTimeout(function() {
        document.location.reload();
      }, 30000);
    })();
  </script>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
<script>
(function() {
setTimeout(function() {
document.location.reload();
}, 30000);
})();
<script>
(function() {
var maxRetries = 10;
var retryCount = parseInt(sessionStorage.getItem('embed_retry') || '0', 10);
if (retryCount >= maxRetries) {
sessionStorage.removeItem('embed_retry');
return;
}
sessionStorage.setItem('embed_retry', retryCount + 1);
setTimeout(function() {
document.location.reload();
}, 30000);
})();
</script>
🤖 Prompt for AI Agents
In `@app/views/embed/loading.html.erb` around lines 6 - 11, The page currently
does an unbounded reload via the IIFE that calls setTimeout ->
document.location.reload(); modify this logic to implement a bounded retry
counter (e.g., stored in sessionStorage under a key like "embedReloadAttempts"
or encoded as a query parameter) so the IIFE reads the counter, increments it on
each scheduled reload, and only calls document.location.reload() while attempts
< MAX_ATTEMPTS (then stop or show an error). Ensure the same unique symbols are
updated: the self-invoking function, the setTimeout invocation, and the
document.location.reload() call so the reload only happens up to the configured
limit and the counter is persisted across reloads.

Comment on lines +6 to +15
<script>
(function() {
window.onload = function() {
if (parent) {
// Send a post message with our loaded height
parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, '<%= request.referer %>');
}
}
})();
</script>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

request.referer injected into JS context — potential XSS and nil risk.

request.referer is user-controlled and is being interpolated directly into a JavaScript string literal. While Rails' <%= %> HTML-escapes the output, HTML escaping is not the correct encoding for JavaScript string contexts. Use escape_javascript (or the j helper) to properly escape for JS.

Additionally, if request.referer is nil (Referer header can be absent), the output will be an empty string, causing postMessage to fail.

🐛 Proposed fix
-           parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, '<%= request.referer %>');
+           var ref = '<%= j(request.referer || "*") %>';
+           parent.postMessage({type: 'discourse-resize', height: document['body'].offsetHeight}, ref);

Note: Using "*" as the fallback target origin allows any origin to receive the message. If this is too permissive, consider deriving the origin from the embeddable_host site setting instead.

🤖 Prompt for AI Agents
In `@app/views/layouts/embed.html.erb` around lines 6 - 15, The template is
interpolating request.referer directly into the JS postMessage call which risks
XSS and breaks when nil; update the embed.html.erb code that calls
parent.postMessage to JS-escape the referer using Rails' escape_javascript (or
the j helper) when building the target origin and provide a sensible fallback
when request.referer is nil (for example use "*" or derive origin from the
embeddable_host site setting) so parent.postMessage receives a valid,
safely-escaped origin string.

@@ -0,0 +1,13 @@
class CreateTopicEmbeds < ActiveRecord::Migration
def change
create_table :topic_embeds, force: true do |t|

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

force: true will silently drop the table if it already exists — risky for production.

If this migration is ever re-run or the table exists from a prior partial deploy, force: true destroys all existing data without warning. Remove it to let the migration fail safely if the table already exists.

🐛 Proposed fix
-   create_table :topic_embeds, force: true do |t|
+   create_table :topic_embeds do |t|
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
create_table :topic_embeds, force: true do |t|
create_table :topic_embeds do |t|
🤖 Prompt for AI Agents
In `@db/migrate/20131217174004_create_topic_embeds.rb` at line 3, The migration
currently uses create_table :topic_embeds, force: true which will drop an
existing table silently; remove the force: true option from the create_table
call (so that create_table :topic_embeds do |t| ...) to prevent destructive
automatic drops and allow the migration to fail if the table already exists; if
you want safe idempotence instead, replace the force option by guarding with a
conditional check like table_exists?(:topic_embeds) or wrap the creation in a
conditional, updating the migration method that contains create_table
:topic_embeds to ensure it no longer uses force: true.

@@ -1,6 +1,6 @@
class CreateTopTopics < ActiveRecord::Migration
def change
create_table :top_topics do |t|
create_table :top_topics, force: true do |t|

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

force: true silently drops the table if it already exists — risk of data loss.

Adding force: true to create_table in a migration means re-running this migration (e.g., db:migrate:redo) will drop and recreate top_topics, destroying all rows. While top_topics is a derived/cache table and can be recomputed, this is a dangerous pattern in migrations. If this was added to work around an "already exists" error, prefer a guard like create_table ... unless table_exists?(:top_topics) or fix the root cause.

🤖 Prompt for AI Agents
In `@db/migrate/20131223171005_create_top_topics.rb` at line 3, The migration
currently passes force: true to create_table :top_topics which will silently
drop and recreate the table (risking data loss); remove force: true and instead
guard creation (e.g., check table_exists?(:top_topics) and only call
create_table when it does not exist, or use create_table with if_not_exists:
true if your Rails version supports it), keeping the table name/top_topics and
the existing create_table block unchanged otherwise.

Comment on lines +27 to +30
if $redis.setnx(retrieved_key, "1")
$redis.expire(retrieved_key, 60)
return false
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Race condition: SETNX + EXPIRE is not atomic — key can persist forever if the process crashes between the two calls.

If the process dies after setnx succeeds but before expire runs, the key will have no TTL and the URL will be permanently throttled. Use a single atomic SET with EX and NX flags.

Proposed fix
-      if $redis.setnx(retrieved_key, "1")
-        $redis.expire(retrieved_key, 60)
-        return false
-      end
+      return false if $redis.set(retrieved_key, "1", ex: 60, nx: true)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if $redis.setnx(retrieved_key, "1")
$redis.expire(retrieved_key, 60)
return false
end
return false if $redis.set(retrieved_key, "1", ex: 60, nx: true)
🤖 Prompt for AI Agents
In `@lib/topic_retriever.rb` around lines 27 - 30, Replace the non-atomic two-step
lock (calling $redis.setnx(retrieved_key, "1") then $redis.expire(retrieved_key,
60)) with a single atomic Redis SET using the EX and NX options so the TTL is
applied atomically; locate the block around the retrieved_key check in
lib/topic_retriever.rb and change the logic to use $redis.set with EX:60 and NX
semantics (treat success the same as setnx returning true) so the key cannot be
left without a TTL if the process crashes.

Sign in to join this conversation on GitHub.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant